- Path: svnews.ubinet.ubs.com!ubszh!ian.johnston@ubs.com
- From: ian.johnston@ubs.com (Ian Johnston (by ubsswop))
- Newsgroups: comp.lang.c++
- Subject: Re: Would/Won't you use a garbage collector?
- Date: 10 Apr 1996 11:34:48 GMT
- Organization: UBS
- Distribution: world
- Message-ID: <4kg6co$2h3@ubszh.fh.zh.ubs.com>
- References: <4kamie$e4d@dfw-ixnews3.ix.netcom.com>
- NNTP-Posting-Host: nol2179.fh.zh.ubs.com
-
- In article <4kamie$e4d@dfw-ixnews3.ix.netcom.com>, giuliano@ix.netcom.com(Giuliano Carlini) writes:
- |> I'm a long time proponent of using garbage collection in C and C++
- |> programs, and I'm curious:
- |> - How many others are there?
- |> - Why don't most C/C++ programmers use it?
- |> I'm particularly interested in finding out why most C/C++ programmers
- |> don't use it. While I have my own theories - which I'll describe below -
- |> I'm interested in finding out more directly from those who are against it.
-
- [...]
-
- |> What follows is my belief for why garbage collection is so little used.
- |> Feel free to respond to anything I say below, but please, first respond
- |> to the questions above. I believe that most people don't use garbage
- |> collection because either they:
- |> - don't know what it is
- |> - don't know it can be used with C/C++.
- |> - are misinformed
- |> - are biased against it by the C/C++ culture
- |> In my experience, most C/C++ programmers either don't know what garbage
- |> collection is, or don't know that it can be used with C/C++. After all,
- |> no major C/C++ compiler includes a garbage collector. At least, as far
- |> as I know. I hope I'm wrong, and that someone can correct me. But even
- |> after I tell them what it is, and that it can be used with C++, almost
- |> everyone still rejects it.
- |>
- |> At first, most offer technical reasons for rejecting it. Almost all are
- |> based on misinformation, since garbage collection is usable and
- |> beneficial for the vast majority of systems.
-
- I think the reasons you give are correct. I also agree that for many (simple)
- systems, garbage collection (GC) is a good idea.
-
- I would have no objection to GC being the default for C++,
- provided that I could switch it off and incur *minimal penalty* by
- overriding the GC. That is, minimal performance penalty induced
- by the garbage collector, even though it is not used. A performance
- penalty equivalent to virtual vs non-virtual function calls would be acceptable.
-
- Remember too, that GC in C or C++ is *hard*. C has been around
- for 25 years or so, and C++ for about 15; it is only relatively recently that
- efficient garbage collectors have appeared for C and/or C++. It is not so much
- that the culture set out biased against GC, but the culture has probably
- grown that way, for the first of the reasons you give above.
-
- Now, do I use GC? Not in the systems I write for a living. Here's why.
-
- First, I write servers that are intended to run for a long time, perhaps 3 months,
- perhaps 6 months, perhaps a year. These servers are constantly allocating and
- freeing objects. To use GC, I need a collector that collects 100% of the dead
- objects: not 95%, not 99%, not even 99.9%. I don't know of any collector for
- C++ that can give this guarantee.
-
- Let's say a server averages 100 object creations per second. In a 10 hour day,
- that's 3.6 million objects. Say a collector leaks 0.1% of all objects. That's
- 3600 objects per day. At an average of 2k per object, that's 7.3MB of memory
- leaked per day.
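
The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the function and parameter names are made up for illustration, the leak rate is expressed as objects per million (0.1% = 1000 per million), and "2k" is taken to mean 2048 bytes.

```cpp
#include <cstdint>

// Illustrative result type; not from the original post.
struct LeakEstimate {
    std::uint64_t objectsCreated;  // objects allocated per working day
    std::uint64_t objectsLeaked;   // dead objects the collector never reclaims
    std::uint64_t bytesLeaked;     // memory lost per working day
};

// Hypothetical helper: estimates the daily leak of a collector that misses
// a fixed fraction of dead objects. Integer arithmetic keeps it exact.
LeakEstimate estimateDailyLeak(std::uint64_t creationsPerSecond,
                               std::uint64_t hoursPerDay,
                               std::uint64_t leakedPerMillion,
                               std::uint64_t bytesPerObject)
{
    LeakEstimate e;
    e.objectsCreated = creationsPerSecond * 3600 * hoursPerDay;
    e.objectsLeaked  = e.objectsCreated * leakedPerMillion / 1000000;
    e.bytesLeaked    = e.objectsLeaked * bytesPerObject;
    return e;
}
```

With the figures from the post, estimateDailyLeak(100, 10, 1000, 2048) gives 3,600,000 objects created, 3,600 leaked, and 7,372,800 bytes lost per day, which is where a figure of roughly 7.3MB comes from.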
-
- Second, I write multi-threaded programs. It's not clear to me how GC works
- in a multi-threaded environment. Can current collectors handle one thread
- allocating, and a different thread freeing? Or will the apparently dead
- object in the allocator thread be collected, even though it is still in use
- by other threads? The answer to that second question has to be "no" if GC is
- to work in a multi-threaded environment.
-
- Third, in the code I write, I use a variety of helper classes to help manage
- memory (and other resources; see below). I don't tend to use C-style arrays
- (stack or heap-based). I don't tend to use raw C++ pointers. These things
- dramatically reduce the potential for bugs. In addition, I make frequent use
- of customised memory allocators to increase performance of allocating/releasing
- space for objects. Sometimes I use shared memory. Sometimes I use statically
- allocated memory. Sometimes I use heap memory.
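
As an illustration of what a customised allocator can look like, here is a minimal sketch of a fixed-size free-list pool wired into a class through class-specific operator new and operator delete. FixedPool and Message are hypothetical names, not the classes used in the systems described above, and the pool is deliberately not thread-safe.

```cpp
#include <cstddef>
#include <new>

// Hypothetical fixed-size pool: released blocks are kept on a free list and
// handed out again, so steady-state allocation avoids the general heap.
class FixedPool {
    struct Node { Node* next; };

public:
    explicit FixedPool(std::size_t blockSize)
        : blockSize_(blockSize < sizeof(Node) ? sizeof(Node) : blockSize),
          freeList_(nullptr) {}

    ~FixedPool() {
        // Return cached blocks to the heap; blocks still in use are not tracked.
        while (freeList_) {
            Node* next = freeList_->next;
            ::operator delete(freeList_);
            freeList_ = next;
        }
    }

    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    void* allocate() {
        if (freeList_) {                  // reuse a previously released block
            Node* n = freeList_;
            freeList_ = n->next;
            return n;
        }
        return ::operator new(blockSize_);  // otherwise fall back to the heap
    }

    void release(void* p) {
        // Reuse the block's own storage as the free-list link.
        freeList_ = ::new (p) Node{freeList_};
    }

private:
    std::size_t blockSize_;
    Node* freeList_;
};

// Hypothetical class whose instances are allocated from the pool.
class Message {
public:
    int id;
    char payload[60];

    static void* operator new(std::size_t size) {
        // A derived class with a different size goes to the global heap.
        if (size != sizeof(Message)) return ::operator new(size);
        return pool().allocate();
    }

    static void operator delete(void* p, std::size_t size) {
        if (!p) return;
        if (size != sizeof(Message)) { ::operator delete(p); return; }
        pool().release(p);
    }

private:
    static FixedPool& pool() {
        static FixedPool instance(sizeof(Message));
        return instance;
    }
};
```

Code that writes "new Message" and "delete m" is unchanged; only the source of the storage differs. A collector that wanted to manage such objects would have to cooperate with every allocator of this kind.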
-
- Unfortunately, there are leaks in the C libraries I use. If a 100% reliable
- GC could somehow be confined to the library, that would make life easier. As it
- is, I have to pick and choose the C library routines I use. Sadly, some can't
- be avoided. This is an argument for GC, rather than against :-)
-
- Fourth, the code I write lives in a mixed-language environment. My C++ libraries
- are linked with main programs written in C or Ada. While GC might survive
- across a C/C++ boundary, it is not clear to me that it would survive across
- an Ada/C++ boundary. It has been a significant effort to craft my C++
- libraries so that they can run reliably without relying on static
- constructors being called!
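
One common way to avoid depending on static constructors is to construct library state on first use rather than at program start-up. The sketch below assumes that technique; it is not necessarily how the libraries described above are built, and Registry, registry() and mylib_register are invented names.

```cpp
#include <string>

// Hypothetical library-internal state.
class Registry {
public:
    Registry() : count_(0) {}

    void add(const std::string& name) {
        names_ += name;
        names_ += ';';
        ++count_;
    }

    int count() const { return count_; }

private:
    std::string names_;
    int count_;
};

// A namespace-scope "Registry g_registry;" would rely on the C++ runtime
// running static constructors before main(), which a C or Ada main program
// may not arrange. A function-local static is constructed the first time
// control reaches it instead.
Registry& registry()
{
    static Registry instance;
    return instance;
}

// C-callable entry point (hypothetical). Returns the number of names
// registered so far, or -1 if name is null.
extern "C" int mylib_register(const char* name)
{
    if (!name) return -1;
    registry().add(name);
    return registry().count();
}
```

The first call to mylib_register constructs the registry, whatever language the main program is written in. Since C++11 that first-use initialisation is also thread-safe; compilers of the mid-1990s gave no such guarantee.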
-
- These are the practical reasons. There is another, major reason which is
- partly practical and partly philosophical.
-
- As you point out, memory is a resource. But it is not the only resource
- my software uses. There are other resources that are in relatively
- short supply: file handles, network connections, semaphores, even threads in
- some cases.
-
- I don't understand why GC should be applied only to memory. If it is important
- to automatically reclaim unused allocated memory, why is it not important to
- automatically reclaim unused file handles, or unused network connections?
-
- An important and extremely useful idiom in C++ is the technique of acquiring
- a resource in a constructor and releasing it in the destructor.
- I have come to use this idiom very heavily, and it has made my code much simpler,
- much less prone to mistakes, much more robust in the presence of exceptions, and
- much more maintainable.
-
- Here's an example:
-
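- class AutoLock
- {
- public:
-     // Acquire the mutex for the lifetime of this object.
-     AutoLock(Mutex &m)
-         : mtx(m)
-     {
-         mtx.lock();
-     }
-
-     // Release the mutex when the object goes out of scope,
-     // whether by normal return or by an exception.
-     ~AutoLock()
-     {
-         mtx.unlock();
-     }
-
- private:
-     Mutex &mtx;
- };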
-
-
- Locking and unlocking a mutex at precisely the right times is critical to
- maximising concurrency and robustness in multi-threaded applications. If
- somehow the destruction of this AutoLock were left to a GC system, I would
- lose control over unlocking the mutex; this would be a disaster for concurrency.
- I simply cannot afford to let the system decide, at some point in the future,
- to release the mutex.
-
- I could, of course, replace this:
-
-
- void someFunc()
- {
-     AutoLock lock(someMutex);
-     manipulate(someObject);
- }
-
-
- with this:
-
-
- void someFunc()
- {
-     someMutex.lock();
-
-     try
-     {
-         manipulate(someObject);
-     }
-     catch(...)
-     {
-         someMutex.unlock();
-         throw;
-     }
-
-     someMutex.unlock();
- }
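
To compare the two versions side by side, here is a self-contained sketch. The Mutex class is a stand-in that only records whether it is held (the real Mutex is not shown in this post), manipulate takes a flag that makes it throw, and the names someFuncRaii and someFuncManual are made up to tell the variants apart.

```cpp
#include <stdexcept>

// Stand-in for the post's Mutex class. It only tracks whether it is held,
// so the effect of each version can be observed; it provides no real
// synchronisation between threads.
class Mutex {
public:
    Mutex() : locked(false) {}
    void lock()   { locked = true; }
    void unlock() { locked = false; }
    bool isLocked() const { return locked; }

private:
    bool locked;
};

class AutoLock {
public:
    explicit AutoLock(Mutex& m) : mtx(m) { mtx.lock(); }
    ~AutoLock() { mtx.unlock(); }

    // Copying would unlock the mutex twice, so it is disabled.
    AutoLock(const AutoLock&) = delete;
    AutoLock& operator=(const AutoLock&) = delete;

private:
    Mutex& mtx;
};

Mutex someMutex;

// Hypothetical operation that fails on request.
void manipulate(bool fail)
{
    if (fail) throw std::runtime_error("manipulate failed");
}

// First version: the destructor releases the mutex on every exit path.
void someFuncRaii(bool fail)
{
    AutoLock lock(someMutex);
    manipulate(fail);
}

// Second version: every exit path has to release the mutex by hand.
void someFuncManual(bool fail)
{
    someMutex.lock();
    try {
        manipulate(fail);
    } catch (...) {
        someMutex.unlock();
        throw;
    }
    someMutex.unlock();
}
```

Both functions leave someMutex unlocked whether or not manipulate throws. The difference is that the first version cannot get this wrong when a new return statement or a second resource is added later.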
-
-
- When I had this discussion some time back on comp.lang.eiffel, people there
- actually proposed that this second version (written in Eiffel)
- was the way to go. (Eiffel, of course, doesn't even have a finalisation
- mechanism, so there is no way to write the first version in Eiffel anyway.
- At least Java has finalisation, but no guarantees when it will be called
- [or even whether it will be called, if I remember rightly].)
-
- Frankly, I am not prepared to forgo the first version without some
- extremely convincing arguments.
-
- If I wrote GUI programs, I could make the same arguments about windows. When
- I click "OK" in a dialog box, I expect the dialog box to disappear. I don't
- expect the dialog box to hang around on screen until the GC kicks in and
- reclaims the dialog box (i.e. calls its finalisation routine/destructor).
-
- Of course, GC proponents will say that you can just call the finalisation
- routine explicitly, thereby closing the dialog box or releasing the mutex.
-
- But that doesn't gain anything. If I need to do that, I might just as
- well delete my allocated objects explicitly.
-
- In the worst of all possible scenarios, you might have to call finalisation
- routines explicitly for some objects (windows, mutexes), but not for others
- (allocated memory). This opens up all sorts of possibilities for resource
- leaks and/or errors.
-
-
- What is needed is either:
-
- - Everything is collected, not just memory resources; the approach taken
- by languages like Python, I guess Lisps, and presumably, even humble
- Visual Basic. For example, how do you create and destroy windows in VB?
- Isn't it enough to just DIM a window and forget about destroying it?
- I think you can do this with OLE objects, anyway.
-
- - Nothing is collected, so the programmer knows they have to manage everything
- themselves; the approach taken by C++, C, and even Pascal, I suppose.
-
- A halfway house is just a recipe for confusion, it seems to me.
-
- To sum up:
-
- - I'm not against the concept of automatically reclaiming resources
- - I am against singling out memory as a special resource
- - I am against collecting anything less than 100% of resources
-
- Ian
-